Comparing holy texts? How the Bible looks different in two different translations
Eugen G Tarnow February 4 2016 09:55:22 AM
By Eugen Tarnow, Ph.D., Avalon Business Systems, Inc.
http://AvalonAnalytics.com
It is tempting to use Natural Language Processing to compare holy texts across religions, as a colleague of Avalon Analytics recently did. But care is needed, because much rides on the details.
Below I show what a word frequency comparison looks like for the SAME text (the Bible, from http://unbound.biola.edu/index.cfm?method=downloads.showDownloadMain ) in two different versions. The top panel displays the "Basic English" version and the bottom panel displays the "American Standard" version. Nothing was done to the text before the comparison except removal of "stopwords" (words that are too common to convey understanding; the list is taken from the NLTK library).
I note two severe discrepancies. In one version "shall" is extremely important, but in the other it is nowhere to be found; in one version "give" is important, but in the other it cannot be found either!
Figure 1. Word frequency plots of the Bible. Top panel is the Basic English version, the bottom panel displays the American Standard version.
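For readers who want to try this kind of comparison themselves, here is a minimal sketch in Python of the two steps described above: counting word frequencies after stopword removal, and then listing frequent words of one version that never appear in the other. It is not the code behind Figure 1. The function names, the abbreviated stopword set and the two toy sentences are all assumptions made for illustration; the sentences are made up and are not quotations from either translation.

```python
import re
from collections import Counter

# Hypothetical, abbreviated stand-in for the NLTK English stopword list.
# With NLTK installed and its stopwords corpus downloaded, the full list
# is available as nltk.corpus.stopwords.words("english").
STOPWORDS = {
    "the", "and", "of", "to", "in", "a", "that", "is", "he", "his",
    "i", "will", "be", "not", "for", "with", "it", "you", "my",
}


def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase words in text, skipping stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


def missing_from_other(freq_a, freq_b, top_n=10):
    """Return words among the top_n of freq_a that never occur in freq_b."""
    return [w for w, _ in freq_a.most_common(top_n) if w not in freq_b]


# Made-up sentences standing in for two versions of the same passage.
version_a = "You shall go and you shall see the land"
version_b = "You will go and you will see the land"

freq_a = word_frequencies(version_a)
freq_b = word_frequencies(version_b)

print(freq_a.most_common(3))
print(missing_from_other(freq_a, freq_b))
print(missing_from_other(freq_b, freq_a))
```

In this toy case the only difference between the two sentences is a single word choice, yet one of the two words sits on the stopword list and the other does not, so it shows up as frequent in one count and is absent from the other. Whether that is what happens in the two Bible versions would have to be checked against the actual texts.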